Structural Matching of Parallel Texts
Authors
Abstract
This paper describes a method for finding structural matching between parallel sentences of two languages (such as Japanese and English). Parallel sentences are analyzed based on unification grammars, and structural matching is performed by making use of a similarity measure of word pairs in the two languages. Syntactic ambiguities are resolved simultaneously in the matching process. The results serve as a useful source for extracting linguistic and lexical knowledge.
Similar resources
How to Facilitate the Proof of Theorems by Using the Induction-matching, and by Generalization
In this paper, we show how we conceive the proof of theorems by structural induction. Our aim is to facilitate the proof of theorems which can lead, in a context of automatic theorem proving, to very lengthy (or even impossible) proofs. We use a very simple tool, the i-matching or induction-matching, which allows us, on the one hand, to define an original procedure of generalization, and on the ...
A Method to Overcome Computer Word Size Limitation in Bit-Parallel Pattern Matching
The performance of pattern matching algorithms based on bit-parallelism degrades when the input pattern length exceeds the computer word size. Although several divide-and-conquer methods have been proposed to overcome that limitation, the resulting schemes are not very efficient and are hard to implement. This study introduces a new fast bit-parallel pattern matching algorithm that is capa...
Aligning Noisy Parallel Corpora Across Language Groups: Word Pair Feature Matching by Dynamic Time Warping
We propose a new algorithm, DK-vec, for aligning pairs of Asian/Indo-European noisy parallel texts without sentence boundaries. The algorithm uses frequency, position and recency information as features for pattern matching. Dynamic Time Warping is used as the matching technique between word pairs. This algorithm produces a small bilingual lexicon which provides anchor points for alignment.
Real-Time Identification of Parallel Texts
Parallel texts are documents that present parallel translations. This paper describes a simple method that can be deployed on a real-time news feed to create an ever-growing source of parallel texts in French and English. Our experiment was conducted on the Canada Newswire news feed. Given some of its intrinsic properties, it was possible to deploy a relatively simple text matching technique ...
Parallel Overlap and Similarity Detection in Semi-Structured Document Collections
The proliferation of digital libraries and the high availability of electronic documents from the Internet have created new challenges for computer science researchers and professionals. This paper discusses the problems of using parallel and cluster computing systems for detecting plagiarism in large collections of semi-structured electronic texts, including software written in formal languages at on...